-
Notifications
You must be signed in to change notification settings - Fork 407
Always print why we disconnect explicitly #3309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always print why we disconnect explicitly #3309
Conversation
...and drop the old, confusing, "protocol error" message when we get a duplicate connection from a peer. This should slightly further disambiguate disconnection reasons, which are a key debugging tool for assigning blame.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3309 +/- ##
==========================================
- Coverage 89.64% 89.60% -0.04%
==========================================
Files 126 126
Lines 102170 102170
Branches 102170 102170
==========================================
- Hits 91590 91554 -36
- Misses 7857 7889 +32
- Partials 2723 2727 +4 ☔ View full report in Codecov by Sentry. |
@@ -2467,7 +2466,8 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, OM: Deref, L: Deref, CM | |||
Some(peer_lock) => { | |||
let peer = peer_lock.lock().unwrap(); | |||
if let Some((node_id, _)) = peer.their_node_id { | |||
log_trace!(WithContext::from(&self.logger, Some(node_id), None, None), "Handling disconnection of peer {}", log_pubkey!(node_id)); | |||
let logger = WithContext::from(&self.logger, Some(node_id), None, None); | |||
log_trace!(logger, "Handling disconnection of peer {} because {}", log_pubkey!(node_id), reason); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Looks like we include the node_id in the logger context and the log itself, which is a bit redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, we do that in a lot of places very much on purpose - not all our users (yet) print context info in their logs (hell, LDK-sample doesn't, even) so if we get logs from users we don't want to be super confused even if we're missing context.
@@ -1333,8 +1333,7 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, OM: Deref, L: Deref, CM | |||
match self.do_read_event(peer_descriptor, data) { | |||
Ok(res) => Ok(res), | |||
Err(e) => { | |||
log_trace!(self.logger, "Disconnecting peer due to a protocol error (usually a duplicate connection)."); | |||
self.disconnect_event_internal(peer_descriptor); | |||
self.disconnect_event_internal(peer_descriptor, "of a protocol error"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this "drop protocol error
message when we get a duplicate connection from a peer." ?
I don't see any handling change related to duplicate connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, its very subtle - In do_read_event
if we hit a duplicate peer we'll unset their_node_id
before returning the error. Then, we'll call disconnect_event_internal
which won't print anything if their_node_id
is None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok i would have never guessed it.
maybe should have been a comment on top of "if let Some((node_id, _)) = peer.their_node_id".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, probably should have explained it in the commit message, though not entirely sure why we'd comment it - logging without a peer id doesn't make much sense (what are we telling the user?), and we already log in do_read_event
saying that its a second connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it also has a hidden behaviour, don't log when "duplicate connection from a peer" because their_node_id
is None
. Maybe not the right place for it though.
@@ -1333,8 +1333,7 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, OM: Deref, L: Deref, CM | |||
match self.do_read_event(peer_descriptor, data) { | |||
Ok(res) => Ok(res), | |||
Err(e) => { | |||
log_trace!(self.logger, "Disconnecting peer due to a protocol error (usually a duplicate connection)."); | |||
self.disconnect_event_internal(peer_descriptor); | |||
self.disconnect_event_internal(peer_descriptor, "of a protocol error"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok i would have never guessed it.
maybe should have been a comment on top of "if let Some((node_id, _)) = peer.their_node_id".
...and drop the old, confusing, "protocol error" message when we get a duplicate connection from a peer.
This should slightly further disambiguate disconnection reasons, which are a key debugging tool for assigning blame.